PEC2
Javier Advani
In [2]:
import pandas as pd
# fuente: https://www.kaggle.com/datasets/ashydv/housing-dataset/code
df = pd.read_csv('Housing.csv')
df.head()
Out[2]:
price area bedrooms bathrooms stories mainroad guestroom basement hotwaterheating airconditioning parking prefarea furnishingstatus
0 13300000 7420 4 2 3 yes no no no yes 2 yes furnished
1 12250000 8960 4 4 4 yes no no no yes 3 no furnished
2 12250000 9960 3 2 2 yes no yes no no 2 yes semi-furnished
3 12215000 7500 4 2 2 yes no yes no yes 3 yes furnished
4 11410000 7420 4 1 2 yes yes yes no yes 2 no furnished

A Bar graph is used when you want to show a distribution of categorical variables or ordinal subgroups of your data. From a bar chart, we can see which groups are highest or most common, and how other groups compare against the others.

In [5]:
# Bar chart

import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="darkgrid")
total = float(len(df))
ax = sns.countplot(x="bedrooms", data=df)
for p in ax.patches:
    percentage = '{:.1f}%'.format(100 * p.get_height()/total)
    x = p.get_x() + p.get_width()
    y = p.get_height()
    ax.annotate(percentage, (x, y),ha='center')
ax.set(title="Number of Bedrooms in Houses", ylabel="%", xlabel= "nº bedrooms")
plt.show()

A Small Multiple is a data visualization that consists of multiple charts arranged in a grid. This makes it easy to compare the entirety of the data.

In [9]:
# Small Multiple
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
 
df=pd.DataFrame({'x': range(1,11), 'y1': np.random.randn(10), 'y2': np.random.randn(10)+range(1,11), 'y3': np.random.randn(10)+range(11,21), 'y4': np.random.randn(10)+range(6,16), 'y5': np.random.randn(10)+range(4,14)+(0,0,0,0,0,0,0,-3,-8,-6), 'y6': np.random.randn(10)+range(2,12), 'y7': np.random.randn(10)+range(5,15), 'y8': np.random.randn(10)+range(4,14), 'y9': np.random.randn(10)+range(4,14) })
palette = plt.get_cmap('Set1')
num=0
for column in df.drop('x', axis=1):
    num+=1
    plt.subplot(3,3, num)
    plt.plot(df['x'], df[column], marker='', color=palette(num), linewidth=1.9, alpha=0.9, label=column)
 
    # Same limits for every chart
    plt.xlim(0,10)
    plt.ylim(-2,22)
 
    # Not ticks everywhere
    if num in range(7) :
        plt.tick_params(labelbottom='off')
    if num not in [1,4,7] :
        plt.tick_params(labelleft='off')
 
    # Add title
    plt.title(column, loc='left', fontsize=12, fontweight=0, color=palette(num) )

# general title
plt.suptitle("How top 9 cryptos evolved\nthese past few days?", fontsize=13, fontweight=0, color='black', style='italic', y=1.02)
 
# Axis titles
plt.text(0.5, 0.02, 'Time', ha='center', va='center')
plt.text(0.06, 0.5, 'Note', ha='center', va='center', rotation='vertical')

# Show the graph
plt.show()
In [1]:
import pandas as pd
import holoviews as hv
from holoviews import opts, dim
from bokeh.sampledata.les_mis import data

hv.extension('bokeh')
hv.output(size=200)

The Chord element allows representing the inter-relationships between data points in a graph. The nodes are arranged radially around a circle with the relationships between the data points drawn as arcs (or chords) connecting the nodes. The number of chords is scaled by a weight declared as a value dimension on the Chord element.

If the weight values are integers, they define the number of chords to be drawn between the source and target nodes directly. If the weights are floating point values, they are normalized to a default of 500 chords, which are divided up among the edges. Any non-zero weight will be assigned at least one chord.

The Chord element is a type of Graph element and shares the same constructor. The most basic constructor accepts a columnar dataset of the source and target nodes and an optional value. Here we supply a dataframe containing the number of dialogues between characters of the Les Misérables musical. The data contains source and target node indices and an associated value column:

In [2]:
links = pd.DataFrame(data['links'])
print(links.head(3))
   source  target  value
0       1       0      1
1       2       0      8
2       3       0     10

In the simplest case we can construct the Chord by passing it just the edges

In [3]:
hv.Chord(links)
Out[3]:

The plot automatically adds hover and tap support, letting us reveal the connections of each node.

To add node labels and other information we can construct a Dataset with a key dimension of node indices.

Additionally we can now color the nodes and edges by their index and add some labels. The labels, node_color and edge_color options allow us to reference dimension values by name.